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Abstract 

In [9], J. Patarin designed a new scheme, called a oil and vinegar", for computing asymmetric 
signatures. It is very simple, can be computed very fast (both in secret and public key) and requires 
very little RAM in srnartcard implementations. The idea consists in hiding quadratic equations in n 
unknowns called "oil" and v = n unknowns called 'Vinegar" over a finite field K, with linear secret 
functions. This original scheme was. broken in [5] by A. Kipnis and A. Shamir. In this paper, we 
study some very simple variations of the original scheme where v > n (instead of v = n). These 
schemes are called "Unbalanced Oil and Vinegar 7 ' (UOV), since we have more "vinegar" unknowns 
than "oil" unknowns. We show that, when van, the attack of [5] can be extended, but when 
v >2n for example, the security of the scheme is still an open problem. Moreover, when v ~ the 
security of the scheme is exactly equivalent (if we accept a very natural but not proved property) to 
the problem of solving a random set of n quadratic equations in unknowns (with no trapdoor). 
However, we show that when tz > n 2 , finding a solution is generally easy. In .this paper, we also 
present some practical values of the parameters, for which no attacks are known. The length of 
the signatures can be as short as 192 bits. We also study schemes with public keys of degree three 
instead of two. We show that no significant advantages exist at the present to recommend schemes 
of degree three instead of two. 

1 Introduction 

Since 1985, various authors (see [2], [4], [7], [8], [9], [10], [11] for example) have suggested some public 
key schemes where the public key is given as a set of multivariate quadratic (or higher degree) equations 
over a small finite field K, ■ 
The general problem of solving such a set of equations is NP-hard (cf [3]) (even in the quadratic case). 
Moreover, when the number of unknowns is, say, n > 16, the best known algorithms are often not 
significantly better than exhaustive search (when n is very. small, Grobner bases algorithms might.be 

efficient). . ■ 

The schemes are often very efficient in terms of speed or RAM required in a srnartcard implementation 
(however, the length of the public key is generally > 1 Kbyte). The most serious problem is that, m 
order to introduce a trapdoor (to allow the computation of signatures or to allow the decryption of 
messages when a secret is known), the generated set of public equations generally becomes a small 
subset of all the possible equations and, in many cases, the algorithms have been broken. For example 
[2] was broken by their authors, and [7] and [9] were broken. However, many schemes are still not 
broken (for example [8], [10], [11]), and also in many cases, some very simple variations have been 
suggested in order to repair the schemes. Therefore, at the present, we do not know whether this idea 
of designing public key algorithms with multivariate polynomials over finite fields is a very powerful 
idea (where only some too simple schemes are insecure) or not. 
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.In this paper, we present wh^hay be the most simple example: the original Oil and Vinegar signature 
scheme (of [9]) was broken (see [5]), but if we have significantly more "vinegar 7 unknowns than "oil" 
unknowns (a definition of the "oil" and "vinegar" unknowns can be found in section 2), then the attack 
of [5] does not work and the security of this more general scheme is still an open problem. 
Moreover, we show that, when we have approximately vinegar unknowns for n oil unknowns, the 
security of the scheme is exactly equivalent (if we accept a natural but not proved property) to the 
problem of solving a random set of n quadratic equations in ^ unknowns (with no trapdoor). This is 
a nice result, since it suggests that some partial proof of security (related to some simple to describe 
and supposed very difficult to solve problems) might be found for some schemes with multivariate 
polynomials over a finite field. However, we show that most of the systems of n quadratic equations 
in n 2 (or more) variables can be solved in polynomial complexity... We also study Oil and Vinegar 
schemes of degree three (instead of two). 

2 The (Original and Unbalanced) Oil and Vinegar of degree two 

Let K — F q be a small finite field (for example K = F2). Let n and v be two integers. The message 
to be signed (or its hash) is represented as an element of A* n , denoted by y = (yx> .»,yn)- Typically, 
q n ~ 2 123 . The signature x is represented as an element of A n+V denoted by x = (zi, x n+v ). 

Secret key 

The secret key is made of two parts: 

1. A bijective and affine function s : A* n+V -f A" n+V . By "affine", we mean that each component of 
the output can be written as a polynomial of degree one in the n + u input unknowns, and with 

= coefficients in A". 

2. A set (5) of n equations of the following type: 

E vi, 1 < i < n, yi = J2 7tf**i a * + J2 A ^; a Jb + + £i a ; + * W • 

3 The coefiicients 7y*, Ay*, Q and <5 t - are the secret coefficients of these n equations. The 

2 values a x , a n (the a oiT unknowns) and a[ t a' v (the "vinegar" unknowns) lie in K. Note 

3 that these equations (*S) contain no terms in a^aj. 

Public key 

Let A be the element of A" n+V defined by A = (o lt a n , a' Xf a' v ). A is transformed into x = s~ l (A) t 
where 5 is the secret, bijective and affine function from K n+V to A n+V . 

Each value t/ t -, 1 < i < n, can be written as a polynomial P; of total degree two in the Xj unknowns, 
1 < j < n + v. We denote by {V) the set of these n equations:- 

Vi, 1 < i < n, Vi = Pi{x u x n+v ) {V). 

These n quadratic equations {V) (in the n + v unknowns Xj) are the public key. 

Computation of a signature (with the secret key) 

The computation of a signature x of y is performed as follows: 

Step 1: We find n unknowns a u .... a* of K and u unknowns a' lf a' w of K such that the n equations 
(<S) are satisfied. 

This can be done as follows: we randomly choose the t; vinegar unknowns and then we compute 
the a: unknowns from (5) by Gaussian reductions (because - since there are no a i a j terms - the 
(S) 'equations are affine in the a,- unknowns when the a\ are fixed. 



Remark: If we fi^fc solution, then we simply try again with^R random vinegar unknowns 
After very few tries, the probability of obtaining at [east one solution is very high, because 
the probability for a n x n matrix over F, to be invertible is not negligible.' (It is exactly 
(1 - ~ - prr)- For q = 2, this gives approximately 30 %, and for j > 2, this 

probability is even larger.) 

Step 2: We compute z = s~ l (.4), where A = (a x , a n , a' x , a'J. s is a signature of y. 
Public verification of a signature 

A signature x of y is valid if and only if all the (V) are satisfied. As a result, no secret is needed to 
check whether a signature is valid: this is an asymmetric signature scheme. 

Note: The name "Oil and Vinegar" comes from the fact that - in the equations (S) - the "oil 
unknowns" <z t - and the 'Vinegar unknowns" a'j are not all mixed together: there are no a,*a 7 * products. 
However, in (V), this property is hidden by the "mixing" of the unknowns by the s transformation. Is 
this property "hidden enough" ? In fact, this question exactly means: ~is the scheme secure ?". When 
v — n, we call the scheme "Original Oil and Vinegar", since this case was first presented in [9]. This 
case was broken in [5]. It is very easy to see that the cryptanalysis of [5]. also works, exactly in the 
same way, when v < n. However, the cases v > n are much more difficult. When v > n, we call the 
scheme "Unbalanced Oil and Vinegar". The analysis of such schemes is the topic of this paper. 

3 A short description of the attack of [5]: cryptanalysis of the case 
v = n 

The idea of the attack of [5] is essentially the following: 

In order to separate the oil variables and the vinegar variables, we look at the quadratic forms of the 
n public equations of (T 7 ), we omit for a while the linear terms. Let G: for 1 < i < n be the respective 
matrix of the quadratic form of Pi of the public equations (P). 

The quadratic part of the equations in the set (<S) is represented as a quadratic form with a corre- 
sponding 2n x 2n matrix of the form : ^ 5 ^ ) ' ^ Upper ^ 71 x n zero su bnaatrix is due to the 
fact that an oil variable is not multiplied by an oil variable. 

After hiding the internal variables with the linear function s, we get a representation for the matrices 



d = S ( ® ^ ] 5*, where 5 is an invertible 2n x 2n matrix. 
y Bi Ci J 

Definition 3.1: We define the oil subspace to be the linear subspace of all vectors in K 2n whose 
second half contains only zeros. 

Definition 3.2; We define the vinegar subspace as the linear subspace of all vectors in il 2n whose 
first half contains only zeros. 

Lemma 1 Let E and F be a 2n x 2n matrices with an upper left zero nxn submatriz. If F is invertible 
then the oil subspace is an invariant subspace of EF~ l . 

Proof: E and F map the oil subspace into the vinegar subspace. If F is invertible, then this mapping 
between the oil subspace and the vinegar subspace is one to one and onto (here we use the assumption 
that v - n). Therefore F~ l maps back the vinegar subspace into the oil subspace this argument 
explains why the oil subspace is transformed into itself by EF~ l . 



Definition 3.4: For an invertible matrix Gj } define Gij = GiGj l . 
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Definition 3.5: Let O^mhe image of the oil subspace bv S" 1 . 
In order to find the oil subspace, we use the following theorem: 

Theorem 3.1 O is a common invariant subspace of all the matrices G:j. 

Proof: 

^=*J:)^'(^;r- s (u:)(U')"V. 

The two inner matrices have the form of E and F in lemma 1. Therefore, the oil subspace is an invariant 
subspace of the inner term and O is an invariant subspace of G{G~ l . 

The problem of finding common invariant subspace of set of matrices is studied in [5]. Applying the 
algorithms in [5] gives us O. We then pick V to be an arbitrary subspace of dimension n*such°that 
V + 0 = A"* 71 , and they give an equivalent oil and vinegar separation. 

Once we have such a separation, we bring back the linear terms that were omitted, we pick random 
values for the vinegar variables and left with a set of n linear equations with n oil variables. 

Note: Lemma 1 is not true any more when v > n. The oil subspace is still mapped by E and F into 
the vinegar subspace. However F -1 does not necessary maps the image by E of the oil subspace back 
into the oil subspace and this is why the cryptanalysis of the original oil and vinegar is not valid for 
the unbalanced case. 

This corresponds to the fact that, if the submatrix of z'eros in the top left corner of F is smaller than 
nx n, then F~ l does not have (in general) a submatruc of zeros in the bottom right corner. For example: 




However, when v — n is small, we see in the next section how to extend the attack. 



□ 4 Cryptanalysis when v > n and v c±n 

In this section, we discuss the case of Oil and Vinegar schemes where v > ti, although a direct application 
. of the attack described in [5] and in the previous section does not solve the problem, a modification of 
the attack exists, that is applicable as long, as v — n is small. 

Definition 4.1: We define in this section the oil subspace to be the linear subspace of ail vectors in 
A" n+ . tf whose last v coordinates are only zeros. 

Definition 4.2: We define in this section the vinegar subspace to be the linear subspace of ail vectors 
in K 71 **" whose first n coordinates are only zeros. 

Here in this section, we start with the homogeneous quadratic terms of the equations: we omit the 
linear terms for a while. 
The matrices G; have the representation 




where the upper left matrix is the n x n zero matrix, A t - is a n x v matrix, B{ is a u x n matrix, C; is 
a v x v matrix and 5 is a (n -f- v) x (n -r v) invertible linear matrix. 
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Definition 4.3: Denne^Tto be [ ° ^ 

Lemma 2 For any matrix E that has the form ^ £ ) , tAe /offot«n ff Ao/di; 
£ transforms the oil subspace into the vinegar subspacs. 

b) If the matrix E' 1 exists, then the image of the vinegar subspace by E' 1 is a subspace of dimension 
v which contains the n-dimensional oil subspace in it. 

Proof: a) follows directly from the definition of the oil and vinegar subspaces. When a) is n ven 
then b) is immediate. ° 

The algorithm we propose is a probabilistic algorithm. It looks for an invariant subspace of the oil 
subspace after it is transformed by 5. -The probability for the algorithm to succeed on the first try is 
small. Therefore we need to repeat it with different inputs. We use the following property: any linear 

combination of the matrices E i} E n is also of the form ^ ^ ) ' 

The following theorem explains why an invariant subspace may exist with a certain probability. 

Theorem 4.1 Let F be an invertible linear combination of the matrices E x , ...,-E n . Then for any k 
such that Ej~ l exists, the matrix FE7 1 has a non trivial invariant subspace which is also a subspace of 
the oil subspace, with probability not less than 17* x for d = v — n. 

Proof: The matrix F maps the oil subspace into the vinegar subspace, the image by F of the oil 
subspace is mapped by E£ l into a subspace of dimension v that contains the oil subspace - these are 
due to lemma 1. We write v = n + d, where d is a small integer. The oil subspace and its image by 
FE7 1 are two subspaces with dimension n that reside in a subspace of dimension n + d. Therefore, 
their intersection is a subspace of the oil subspace with dimension not less than n — d. We denote the 
oil subspace by T Q and the intersection subspace by 7i. Now, we take the inverse images by FE7 1 of 
Ii: this is a subspace of 7b (the oil subspace) with dimension not less than n — d and the intersection 
between this subspace and Ii is a subspace of I t with dimension not less than n — 2d. We call this 
subspace I<i. We can continue this process and define It to be the intersection of Zj-t and its inverse 
image by FEk — l. These two sujbspaces have co-dimension not more than d in Therefore, It has 

a co-dimension not more than Id in It-i or a co-dimension not more than d in It-\- We can carry on 
this process as long as we are sure that the inverse image by FE7 1 of It has a non trivial intersection 
with It- This is ensured as long as the dimension of It is greater than but when the dimension is d 
or less than <f, there is no guaranty that these two subspaces - that reside in It-i - have a non trivial 
intersection. We end the process with It that has dimension < d that resides in It-i with dimension 
not more than 2d. 

We know that the transformation (EG7*)~ l maps It into It-i- With probability not less than £71^ > 
there is a non zero vector in It that is mapped to a non zero mutiple of itself - and therefore there is a 
non trivial subspace of FE& — 1 which is also a subspace of the oil subspace. 

Note: It is possible to get a better result for the expected number of eigenvectors and with much 
less effort: i\ is a subspace with dimension not less than n — d and is mapped by FE7 1 into a subspace 
with dimension n. The probability for a non zero vector to be mapped to a non zero multiple of itself 
is ^rrf. To get the expected value, we multiply it by the number of non zero vectors in It gives 

a value which is not less than ^ q ^ l ^n^ l ~^ * Since every eigenvector is counted q — 1 times, then the 



We define O as in section 3 and we get the following result for 0: 



expected number of invariant subspcaes of dimension 1 is not less than ~ q~ 



Theorem 4.2 Let F be an invertible linear combination of the matrices Gi; G n - Then for any k 
such that G^ 1 exists, the matrix FG7 1 has a non trivial invariant subspace, which is also a subspace 
of O with probability not less than 17^ for d— v — n. 



m 
rti 
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FC?^ 1 = (a l G l +...+a n (7 n )G* 1 = S(ai^i+...+ar n S n )S«(S t )-i£] ; -i5- 1 = 5(a 1 5 1 +...+a B E a )£r^5-i. 

The inner term is an invariant subspace of the oil subspace with the recuired probabilitv Therefore 
the same will hold for FG k , but instead of a subspace of the oil subspace, we get a subspace of O. ' 

How to find O ? 

We take a random linear combination of G u G n and multiply it by an inverse of one of the <?■ 
matrices. Then we calculate all the minimal invariant subspaces of this matrix (a minimal invariant 
subspace of a matrix A contains no non trivial invariant subspaces of the matrix A - these subspaces 
corresponds to irreducible factors of the characteristic polynomial of .4). This can be done in proba- 
bilistic polynomial time using standard linear algebra techniques. This matrix may have an invariant 
subspace wich is a subspace of 0. 

The following lemma enables us to distinguish between subspaces that are contained in O and random 
subspaces. 

Lemma 3 If H is a linear subspace and H CO, then for every x, y in H and every i, G,(x, y) = 0 
(here we regard G{ as a bilinear form) . 

Proof: There are x' and y 1 in the oil subspace such that x' = z5 -1 and y' = yS~ l . 

<7i(«.»).,s(«, SfJsV-C*-')^ £)(b'5-')5)w(» : £)<*)'-<>. 

The last term is zero because x' and'y' are in the oil subspace. 

This lemma gives a polynomial test to distinguish between subspaces of O and random subspaces. 
If the matrix we used has no minimal subspace which is also a subspace of O, then we pick another 
linear combination of G\ % G n , multiply it by an inverse of one of the Gk matrices and try again. 

After repeating this process approximately q~ dJrl times, we find with good probability at least one zero 
vector of O. We continue the process until we get n independent vectors of 0. These vectors span O . 
The expected complexity of the process is proportional to g~ 4 ' hl n 4 . We use here the expected number 
of tries until we find a non trivial invariant subspace and the term n* covers the computational linear 
algebra operations we need to perform for evey try. 

5 The cases v ~ \ (or v > y-) 
Property 

Let (A) be a random set of n quadratic equations in (n -f v) variables x L , r n+w . (By "random" we 
mean that the coefficients of these equations are uniformly and randomly chosen). When v ~ ^ (and 
more generally when v > ~-) t there is probably - for most of such (A) - a linear change of variables 
(r l; x n+v ) h-j- (x' Ll x^J such that the set (A') of (A) equations written in (x[ t x^ +v ) is an "Oil 
and Vinegar* system (i.e. there are no terms in x\ • Xy-with i <n and j < n). 

An argument to justify the property 
Let 

*i = Qfi.i^i + ^1,2^2 -r ... -f ax^+vZ^ 

By writing that the coefficient in all the n equations of [A) of all che x' • x^ (i < n and j < n) is zero, 
we obtain a system of n • n • :1 ~ quadratic equations in the (n -r u) • n variables o^j (1 < i < n + u, 
1 < j < n). Therefore, when v > approximately we may expect to have a solution for this system 
of equations for most of (A). 
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1. This argument is very natural, but this is not a complete mathematical proof. 

2. The system may have a solution, but finding the solution might be a difficult oroblem. This is 
why an Unbalanced Oil and Vinegar scheme might be secure (for -veil chosen parameters): there 
is always a linear change of variables that makes the problem easy to solve, but finding such a 
change of variables might be difficult. ° 

3. In section 7, we will see that, despite the result of this section, it is not recommended to choose 
v > n 2 . 

6 Solving a set of n quadratic equations in k unknowns, k > n, is 
NP-hard 

We present in section 7 an algorithm that solves in polynomial complexity more than 99% of the sets 
of n quadratic equations in n 2 (or more) variables (Le. it will probably succeed in more than 99% of 
the cases when the coefficients are randomly chosen). 

Roughly speaking, we can summarize this result by saying that solving a "random 7 " set of n quadratic 
equations in n 2 (or more) variables is feasible in polynomial complexity (and thus is not NP-hard if 
P NP). However, we see in the present section that the problem of solving any (i.e. 100%) set of n 
quadratic equations in k > n variables (so for example in k = n 2 variables) is NP-hard ! 
To see this, let us assume that we have a black box that takes any set of n quadratic equations with k 
variables in input, and that gives one solution when at least one solution exists. Then we can use this 
black box to find a solution for any set of n quadratic equations in n variables (and this is NP-hard). 
We proceed (for example) as follows. Let (A) be a set of (n — 1) quadratic equations with (n — 1) 
variables z it X2, i n -i. Then let y X i —> Ha be a more variables. 

Let (B) be the set of (A) equations plus one quadratic equation in yi, y a (for example the equation: 
(yi + ••• -i- Va) 2 = 1)* Then (23) is a set of exactly n quadratic equations in (n -f- 1 -f-a) variables. It is 
clear that from the solution of (B) we will immediately find one solution for (A). 

Note 1: (B) has a very special shape ! This is why .there is a polynomial algorithm for 99% of the 
•equations without contradicting the fact that solving these sets (B) of equations is a NP-hard problem. 

Note 2: For (B) } we can also add more than one quadratic equations in the y T - variables and we can 
linearly mix these equations with the equations of (A). In this case, (B) is still of very special form 
but this very special form is less obvious at first glance since all the variables z t - and yj are in all the 
equations of (B). 

7 A generally efficient algorithm for solving a random set of n quadratic 
equations in n 2 (or more) unknowns 

In this section, we describe an algorithm that solves a system of n randomly chosen quadratic equations 
in ti -f- u variables, when v > n-. 
Let (S) be the following system: 

OS) 



l<i<j<n±v l<t<n4-v 



, l<t<j<n-hv l<t<rt~v 

The main idea of the algorithm consists in using a change of variables such as: 

X i — #1,12/1. -f &2 t lV2 -T + Qfn.iKn -T Ctn+LlUn+l 4* + Qfn+v.ltfi+v 
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whose otj coefficients (for < n, 1 < j < n 4- v) are found step by slipf in order that the resulting 

system (S') (written with respect to these new variables y 1: --t yn+v) is easy Co solve. 



s^^L 



We begin by choosing randomly a ltl) .... a\ 



JT-f-V * 



• We then compute or^i, o?2,n+w such that [S f ) contains no y\y<: terms. This condition leads to 
a system of n linear equations on the (n -r v) unknowns a 2 j ( 1 < j < n -h u) : 



l<:*<;<n+t/ 



(1 < k < n). 



• We then compute 0:3,1, oc^^ v such that (<S') contains neither yiyz terms, nor yzyz terms. This 
condition is equivalent to the following system of In linear equations on the (n -f v) unknowns 
.03 J (1 <3 < n + v): 

E a ijkOti t :a 3 j = 0 (1 < k < n) 

l<:<j<n-fv 

E a t yjfccr2 tt -a 3 j = 0 (1 < k < n) 

l<:<j<n+v 



• ■ Finally, we compute ar n ,i, a n ,n+v such that (S') contains neither y x y n terms, nor y 2 y n terms, 
nor y n -iyn terms. This condition gives the following system of (n — l)n linear equations on 
the (n -j- u) unknowns a n j (1 < J < n-f-u): ' 



l<i<j<n-hv 



I 1<*<J<*+ V 



(1 < Jfc < n) 
(1 <.k < n) 



In general, all these linear equations provide at least one solution (found by Gaussian reductions). In 
particular, the last system of n{n - 1) equations and (n + v) unknowns generally gives a solution, as 
soon as n 4- v > n(n - 1), i.e. v > n(n - 2), which is true by hypothesis. 

/ <*ur \ / <*n,l \ 

Moreover, the n vectors : , ■ are very likely to be linearly independent for a 

\ <*I,n+v / V v / 

random quadratic system (*S). 

The remaining a id constants (i.e. those with n + 1 < i < n + u and 1 < j < n + 1) are randomly 
chosen, so as to obtain a bijective change of variables. 

By rewriting the system (S) with respect to these new variables y i% we are led to the following system: 
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E &>yr + yiLi f n(y n +i, yn+v) + ». + ynl^n^n+i, »m yn-v) +Qn(yn+i, y«+v) - 0 



where each £ t -,y is an affine function and each Q; is a quadratic function. 
We then compute y n -ri, yn+v such that: 

V:, 1 < i < n, V;, 1 < j < n + t/, Z;j(y n -ri, ■.•^nfv) = 0. 

This is possible because we have to solve a system of n 2 equations and v unknowns, which generally 
provides at least one solution, as long as v > nr. 
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ng system of n equations on the n unkSi^s 



It remains to solve the fo^^ing system of n equations on the n unkSl^s y Xl y n ; 

' £ Pay? = Xi 

on 



where A* = -Q*(yr.+i, t/n-hv) (1 < < n). 

In general, this gives the yf by Gaussian reduction. 

8 A variation with, twice smaller signatures 

In the UOV described in section 2, the public key is a set of n quadratic equations y: = P{(zi, r n + v ) r 
for 1 < i < n, where y = ("j^, y n ) is the hash value of the message to be signed. If we use a collision- 
free hash function, the hash value must at least be 123 bits long. Therefore, q n must be at least 2 123 , 
so that the typical length of the signature, if v = 2n, is at least 3 x 128 = 384 bits. 
As we see now, it is possible to make a small variation in the signature design in order to obtain twice 
smaller signatures. The idea is to keep the same polynomial P; (with the same associated secret key), 
but now the public equations that we check are: 

» ym ^li ••-» ^n+vj — U, 

where Li is a linear function in (x Ll x n +v) a-ad where the coefficients of Li are generated by a hash 
function in {y u —»yn)- 

For example Li{y\ t *i f — t r n-f-v) = a 1 z L +a 2 r 2 +...-fa^ +t ,x n+t ,, where (0:1,0:2, =Hash(y ll 
—1 yn||t)- Now, n can be chosen such that ? n > 2 s4 (instead ? n > 2 128 ). (Note: q n must be > 2 64 in 
order to avoid exhaustive search on a solution r). If v = 2n and g n ~ 2 64 , the length of the signature 
will be 3 X 64 = 192 bits. 

9 Oil and Vinegar of degree three 
9.1 The scheme 

The quadratic Oil and Vinegar schemes described in section 2 can easily be extended to any higher 
degree. We now present the schemes in degree three. 

Variables 

Let K be a small finite field (for example K ■= F 2 ). Let a Xt be n elements of K, called the 

"oil" unknowns. Let af x , af v be u elements. of K, called the "vinegar" unknowns. 

Secret key. 

The secret key is made of two parts: 

1. A bijective and affine function 5 : K n +* J -+ iv n+v . 

2. A set (S) of n equations of the following type: 

Vi < n, yi = Yl^ : '^ a ^ a t + 12^^ 

The coefficients <? ijk , m ikix A;/*, v ljkl £ y and 5 { are the secret coefficients of these n equations. 
Note that these equations (S) contain no terms in a/a*a* or in a.ja k : the equations are affine m 
the ay unknowns when the a/ k unknowns are fixed. 



.Public key- 




Let A be the element of K n ^ v defined by A = (a lt a^, . .4 is transformed into x = <?- l (.4), 

where 5 is the secret, bijective and affine function from K n ^ v to K n + V . Each value Vi, I < i < can 
be written as a polynomial F; of total degree three in the Xj unknowns, 1 < j < n + v. We denote bv 
(7?) the set of the following n equations: 

Vi, 1 <i<n,y ; = P t (r x n ^ v ) (P). 

These n equations (V) are the public key- 



Computation of a signature 

Let y be the message to be signed (or its hash value). 

Step 1: We randomly choose the v vinegar unknowns a' it and then we compute the a t * unknowns from (S) 
by Gaussian reductions (because - since there are no a t -ay terms - the (S) equations are aSne in 
the c.i unknowns when the a\ are fixed. (If we find no solution for this affine system of n equations 
and n "oil 77 unknowns, we just try again with new random 'Vinegar" unknowns.) 

Step 2: We compute x = 5" 1 (A), where A = {a it ..^a^a^, a: is a signature of y. 

Public verification of a signature 

A signature x of y is valid if and only if all the (V) are satisfied. 

9.2 First cryptanalysis of Oil and Vinegar of degree three when v <n 

We can look at the quadratic part of the public key and attack it exactly as for an Oil and Vinegar of 
degree two. This is expected to work when v < n. 

Note: If there is no quadratic part (i.e. is the public key is homogeneous of degree three), or if this 
attack does not work, then it is always possible to apply a random affine change of variables and to try 
again. Moreover, we will see in section '9.3 that, surprisingly, there is an even easier and more efficient 
attack in degree three than in degree two ! 

9.3 Cryptanalysis of OiLand Vinegar of degree three when v < (1 + V3)n and K is 
of characteristic ^ 2 (from an idea of D. Coppersmith, cf [1]) 

The key idea is' to detect a "linearity* in some directions. We search the set V of the values d - 
(di, <*n+v) such that: 

Vr, Vi, 1 < i < n, Pi(x + d) + Pi(x - cf) = 2P { {z) (#)■ 

By writing that each x k indeterminate has- a zero coefficient, we obtain n • (n + v) quadratic equations 
in the (rz + v) unknowns dj. 

(Each monomial x { xjx k gives {xj + dj) [x k + d k ){x t + d t ) + (*/ - dj)(x k " " di)-2xjx k x tl i.e. 

2(x j d k d e + x k d j d i + xzd j d k ).) ^ 
Furthermore, the cryptanalyst can specify about n - 1 of the coordinates d k of d, since the vectorial 
space of the correct d is of dimension n. It remains thus to solve n-(n+v) quadratic equations in (tH-1) 
unknowns dj. When v is not too large (typically when < n(n + u), i.e. when v < (1 + V3)n), 

this is expected to be easy. 

As a result, when v < approximately (1 + Vd)n and \K\ is odd, this gives a simple way to brealc tne 
scheme. 

Note 1: When v is sensibly greater than (1 + >/3)n (this is a more unbalanced limit than what we 
had in the quadratic case), we do not know at the present how to break the scheme. 
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Note 2: Strangdy eno^^his cryptaaalysis of degre three Oil ancJftgar schemes does not *-ork 
on degree two Oil and Vinegar schemes. The reason is that - in decree wo - wricin» 

Vx, Vi, 1 < i < n : P;{x -rd) + R : ( x _ <£) = 2P:(r) 

only gives n equations of degree two on the (n 4- t/) rfy unknowns (that we do not know how to <olve) 
(Each monomial xjx k gives {x j + dj)(z k + d k ) + (ry - d 7 )(x k - d k ) - 2r ; -s jfe( i.e. 2<f ; d t .) 



Note 3: 
to cover 



In degree two, we have seen that Unbalanced Oil and Vinegar public keys are expected 
almost all the set of n quadratic equations when v ~ ^. In degree three, we have a similar 
property: the public keys are expected to cover almost all the set of n cubic equations when va ^ 
(the proof is similar). 6 



10 Public key length 

It is always feasible to make some easy transformations on a public key in order to obtain the public key 
in a canonical way such that this canonical expression is slightly shorter than the original expression. 

First, it is always possible to publish only the homogeneous part of the quadratic equations (and not 
the linear part), because if we know the secret affine change of variables, then we can solve P(x) — y in 
an Oil and Vinegar scheme, we can also solve P(x) -f L(x) — y } where L is any linear expression with 
the same affine change of variables. It is thus possible to publish only the homogeneous part P and to 
choose a convention for computing the linear part L of the public key (instead of publishing L). For 
example, this convention can be that the linear terms of L in the equation number i (L < i < n) are 
computed from Hash(i|[/i) (or from Hash(:j|P)), where Hash is a public hash function and where Id 
is the identity of the owner of the secret key. 

On the equations, it is also possible to: 

1. Make linear and bijective changes of variable x* = A(r). 

2. Compute a linear and bijective transformation on the equation: V — t{V), (For example, the 
new first equation can be the old first plus the old third equation, etc). 

By combining easily these two transformations, it is always possible to decrease slightly the lenght of 
the public key. 

Idea 1: It is possible to make a change, of variables such that the first equation is in a canonical 
form (see [6], chapter 6) . With this presentation of the public key, the length of the public key will' be 
approximately times the initial length. 

Idea 2: Another idea is to use the idea of section 7, i.e. to create a square of A x A zeros in the 
coefficients, where A ~ V n +■ u - With this presentation, the lenght of the public key is approximately 

fn+ ^S" +v) times the initiaJ Ien s th - 

Remark: As we will see in section 12, the most efficient way of reducing, the length of the public 
key is to choose carefully the values q and n. 



11 Summary of the results 

The underlying field is K = F ? with q = p 771 . Its characteristic is p. 

u As difficult- as random" means that the problem of breaking the scheme is expected to be as difficult 
as the problem of solving a system of equations in u variables when the coefficients are randomly chosen 
(i.e. with no trapdoor). 



f 3 ? 
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Not broicen^^ as 




Degree 




Not Broken 


difficult as random 


Broken (despite as 
difficult as random) 


2 (for ail p) 


v < n 


n <v 


^ <v<n 2 


V > W 


-j (tor p — 2) | 


t> < (1 — VO)7l | 


(1 + \/3)n < v < V 

— : — — — 2 






3 (for p = 2) 1 


U < 71 | 




7f S u s ^ 


V > 71 4 



In this table, we have summarised our current results on the attacks on Unbalanced Oil and Vinegar 
schemes. The original paper ([5]) was only studying the case v = a for quadratic equations. 

12 Concrete examples of parameters 

In all the examples below, we do not know how to break the scheme. We have arbitrary chosen v = 2n 
(or v = 3n) in all" these examples (since v < n and v > n 2 are insecure). 

Example 1: ' K = F 2 , n = 128, v - 256 (or u = 334). The signature scheme is the one of section 
2. The length' of the public key is approximately n • ( (n ^ )2 ) bits. This gives here a huge value: 
approximately 1.1 Mbytes (or 2 Mbytes) ! The length of the secret key (the s matrix) is approximately 
(n-hu) 2 bits, i.e. approximately 18 Kbytes. However, this secret key can always be generated from a 
small secret seed of, say, 64 bits. 

Example 2: K = F 2l n = 64, v = 128 (or v = 192). The signature scheme is the one section 8. The 
length of the public key is 144 Kbytes (or 256 Kbytes). 

Example 3: K = Fis, n = 16, v — 32 (or v = 48). 5 is a secret affine bijection of F 16 . The signature 
scheme is the one section 8. The length of the public key is 9 Kbytes (or 16 Kbytes). 

Example 4: K = Fi 6 , n = 16, v = 32 (or v = 48). 5 is a secret affine bijection of Fis such that all 
its coefficients lie in F2. Moreover, the secret quadratic coefficients are also chosen in F2, so that the 
public functions P{, 1 < i < n, are n quadratic equations in. (n-i- u) unknowns of Fis, with coefficients 
in F2. In this case (the signature scheme is still the one of section 8), the length of the public key is 
2.2 Kbytes (or 4 Kbytes). 

Note: In ail these examples, n > 16 in order to avoid Grobner bases algorithms to find a solution r, 
and q 71 > 2 64 in order to avoid exhaustive search on r. 



13 Conclusion 

The original Oil and Vinegar signature algorithm had a very efficient cryptanalysis (cf [5]). Moreover, 
we have seen in this paper that Oil and Vinegar schemes are often not more secure in degree three than 
in degree two. However, surprisingly, some of the very simple variations called "Unbalanced Oil and 
Vinegar" described in this paper have so far resisted all attacks. The scheme is still very simple, very 
fast, and its parameters can be chosen in order to have a reasonable size for the public key. Its security 
is an open problem, but it is interesting to notice that - when the number of "vinegar unknowns'* 7 
becomes approximately (for n "oil unknowns") - then (if we accept a natural property) the scheme 
is as hard to break as a random set of n quadratic equations in ^ unknowns (with no trapdoor). 
This may give hope to obtain more concrete results of security on multivariate polynomial public key 
cryptography. 
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